✨ Static Site SDK: UX improvements and --dry-run option #151

Robdel12 · 2025-12-22T06:21:09Z

Summary

Improves the static-site SDK with better UX, smarter defaults, and debugging tools:

Exit early with helpful message when no TDD server or API token is found (instead of continuing with 456 "client not initialized" errors)
Smart concurrency defaults based on CPU cores (min 2, max 8, using half of available cores)
Interactive progress display with ETA in TTY mode, traditional logging in CI
Fix sitemap index handling - now follows child sitemaps correctly instead of treating them as pages
Add --dry-run option to preview discovered pages without capturing screenshots

Changes

Better Error Handling

Exits early with clear instructions when no TDD server or API token is available
Shows two options: start TDD server or set VIZZLY_TOKEN

Smart Concurrency

Auto-detects optimal concurrency from CPU cores
Formula: Math.max(2, Math.min(8, Math.floor(cores / 2)))
Can still be overridden via config or CLI

Progress Display

In TTY: Single updating progress line with percentage and ETA
In CI: Traditional per-task logging
Shows completion time at end

Sitemap Index Support

Properly handles sitemap index files (sitemap of sitemaps)
Follows child sitemap references to extract actual page URLs
Fixes issue where sitemap-0.xml was being captured as a page

Dry Run Option

vizzly static-site ./dist --dry-run

Output shows:

Pages grouped by source (sitemap vs HTML scan)
Task count (pages × viewports)
Viewport details
Concurrency setting

Test plan

Run tests: npm test (154 tests passing)
Test dry-run with real static site build
Verify interactive progress in terminal
Verify CI-style output with CI=true
Test sitemap index handling

- Exit early with helpful message when no TDD server or API token found - Add smart default concurrency based on CPU cores (half cores, min 2, max 8) - Add progress with ETA during screenshot capture - Show total completion time at end of run The --concurrency flag already existed; now the default is smarter.

Sitemap: - Fix sitemap index files returning child sitemap URLs as pages - Now follows sitemap index to parse actual child sitemaps - Extracts real page URLs instead of sitemap-0.xml etc. Progress display: - Interactive terminals: single updating progress line with ETA - Non-interactive (CI): traditional per-task logging - Shows completion summary with total time and error count

- Add --dry-run flag to print discovered pages without capturing screenshots - Shows pages grouped by source (sitemap vs HTML scan) - Displays task count, viewports, and concurrency settings - Useful for debugging page discovery and include/exclude patterns Also includes tests for: - Sitemap XML parsing (standard and index formats) - Config schema validation and smart concurrency defaults - Dry-run mode behavior with various options

claude · 2025-12-22T06:25:21Z

Pull Request Review

This PR significantly improves the static-site SDK's user experience with smart defaults, better error handling, and helpful debugging tools. Overall, the implementation is solid with excellent test coverage. Here's my detailed feedback:

✅ Strengths

Code Quality

Clean, focused implementation - Each change addresses a specific UX pain point
Excellent test coverage - 784 additions include 546 lines of new tests (70% test code!)
Good separation of concerns - Dry-run logic exits early, progress display is modular
Strong error messages - Clear, actionable guidance when TDD server/token missing

UX Improvements

Smart concurrency defaults (clients/static-site/src/config-schema.js:14-17) - CPU-aware algorithm with sensible bounds (min 2, max 8) is well-designed
Interactive progress display (clients/static-site/src/tasks.js:203-212) - TTY detection with fallback to traditional logging shows good attention to different environments
Dry-run feature (clients/static-site/src/index.js:97-138) - Comprehensive preview helps users debug page discovery issues
Sitemap index support (clients/static-site/src/utils/sitemap.js:24-43) - Properly follows child sitemaps instead of treating them as pages

🔍 Issues & Concerns

1. Potential Race Condition in Progress Display (Medium Priority)

Location: clients/static-site/src/tasks.js:222-229

Issue: When running with high concurrency, multiple tasks may fail simultaneously. The writeProgress('') + process.stdout.write sequence could interleave with concurrent success messages, leading to garbled output.

Recommendation: Wrap the entire error logging block in a mutex/lock, or accumulate errors and print them after clearing the progress line once.

2. ETA Calculation May Be Inaccurate Initially (Low Priority)

Location: clients/static-site/src/tasks.js:195-200

Issue: Early in execution, avgTime can be highly variable. The first task might take 5s (cold browser), while subsequent ones take 500ms, leading to wildly inaccurate ETAs initially.

Recommendation: Consider using an exponential moving average or wait until N tasks complete before showing ETA.

3. Sitemap Child Resolution Has Security Implications (Medium Priority)

Location: clients/static-site/src/utils/sitemap.js:28-31

Issue: A malicious sitemap could contain URLs like https://evil.com/../../../etc/passwd. While join() normalizes paths, the .split('/').pop() approach is fragile.

Recommendation:

Validate that filename doesn't contain path traversal sequences
Check that the resolved childPath is actually within baseDir

4. Redundant Dynamic Imports (Low Priority)

Location: clients/static-site/src/utils/sitemap.js:17-18

Issue: Dynamic imports are already available at the top of the file. These redundant dynamic imports add overhead.

Recommendation: Import dirname and existsSync at the top of the file instead.

5. Silent Failures in Sitemap Parsing (Low Priority)

Location: clients/static-site/src/utils/sitemap.js:34-40

Issue: Silent failures make debugging difficult. Users won't know if a child sitemap failed to parse due to permissions, corrupted XML, or other issues.

Recommendation: Add optional debug logging when VIZZLY_LOG_LEVEL is set to debug.

6. getDefaultConcurrency() Performance (Low Priority)

Location: clients/static-site/src/config-schema.js:94,109

Issue: cpus().length is called every time getDefaultConcurrency() executes. While cheap, this could be cached.

Recommendation: Cache the result in a module-level constant.

💡 Suggestions

1. Consider Exponential Backoff for Failed Child Sitemaps

If a child sitemap is temporarily unavailable (network file system, etc.), might be worth retrying with backoff instead of silently skipping.

2. Add --verbose Option for Dry Run

The dry-run output is great, but for large sites with 1000+ pages, it could be overwhelming. Consider adding a verbose flag to show all pages vs just a summary.

3. Progress Bar Library for Better UX

Consider using a library like cli-progress or ora for more robust progress display handling (handles terminal resizing, concurrent writes, etc.).

📋 Test Coverage Analysis

Excellent work on testing! New test files provide comprehensive coverage:

✅ config-schema.test.js - Tests CPU-based concurrency defaults thoroughly
✅ dry-run.test.js - Covers all dry-run scenarios including edge cases
✅ sitemap.test.js - Tests both standard and index sitemap formats

Suggestion: Add edge case tests for:

Sitemap index with missing child files
Sitemap with malformed/malicious URLs
Concurrent task failures in progress display

🔒 Security Assessment

✅ No injection vulnerabilities detected
⚠️ Path traversal risk in sitemap child resolution (see Issue ♻️ Refactor server architecture with new handlers #3 above)
✅ Environment variable usage is safe
✅ No eval or dynamic code execution

🎯 Summary

Recommendation: Approve with minor changes

This is a high-quality PR that delivers real value. The UX improvements are well-thought-out and the test coverage is exemplary. Address the path traversal concern (Issue #3) before merging, and consider the race condition fix (Issue #1) for a follow-up PR.

Priority Fixes:

🔴 Add path validation to sitemap child resolution (security)
🟡 Fix potential race condition in error logging (UX)

Nice-to-haves:

Optimize getDefaultConcurrency() to cache CPU count
Add debug logging for skipped child sitemaps
Improve ETA accuracy with exponential moving average

Great work! The static-site SDK is significantly better with these changes.

- Fix race condition in progress display by queuing errors - Improve ETA accuracy: wait for 5+ samples, use recent 20 for average - Add path traversal protection for sitemap child resolution - Move dynamic imports to top level in sitemap.js - Cache CPU count for concurrency default - Add security tests for sitemap path traversal

Robdel12 added 3 commits December 21, 2025 23:54

This comment has been minimized.

Sign in to view

Robdel12 merged commit 4ad26cb into main Dec 22, 2025
18 checks passed

Robdel12 deleted the rd/static-site branch December 22, 2025 06:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

✨ Static Site SDK: UX improvements and --dry-run option #151

✨ Static Site SDK: UX improvements and --dry-run option #151

Uh oh!

Robdel12 commented Dec 22, 2025

Uh oh!

This comment has been minimized.

claude bot commented Dec 22, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

✨ Static Site SDK: UX improvements and --dry-run option #151

✨ Static Site SDK: UX improvements and --dry-run option #151

Uh oh!

Conversation

Robdel12 commented Dec 22, 2025

Summary

Changes

Better Error Handling

Smart Concurrency

Progress Display

Sitemap Index Support

Dry Run Option

Test plan

Uh oh!

This comment has been minimized.

claude bot commented Dec 22, 2025

Pull Request Review

✅ Strengths

Code Quality

UX Improvements

🔍 Issues & Concerns

1. Potential Race Condition in Progress Display (Medium Priority)

2. ETA Calculation May Be Inaccurate Initially (Low Priority)

3. Sitemap Child Resolution Has Security Implications (Medium Priority)

4. Redundant Dynamic Imports (Low Priority)

5. Silent Failures in Sitemap Parsing (Low Priority)

6. getDefaultConcurrency() Performance (Low Priority)

💡 Suggestions

1. Consider Exponential Backoff for Failed Child Sitemaps

2. Add --verbose Option for Dry Run

3. Progress Bar Library for Better UX

📋 Test Coverage Analysis

🔒 Security Assessment

🎯 Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants